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^ ; Abstract 

^ ' An efficient Bayesian inference method for problems that can be mapped onto dense graphs is pre- 
^\ • sented. The approach is based on message passing where messages are averaged over a large number of 
replicated variable systems exposed to the same evidential nodes. An assumption about the symmetry of 
the solutions is required for carrying out the averages; here we extend the previous derivation based on 

■ a replica symmetric (RS) like structure to include a more complex one-step replica symmetry breaking 

d . (IRSB)-like ansatz. To demonstrate the potential of the approach it is employed for studying critical 

y , properties of the Ising linear perceptron and for multiuser detection in Code Division Multiple Access 
X3 ■ 

C5 ■ (CDMA) under different noise models. Results obtained under the RS assumption in the non-critical 

o ; 

regime give rise to a highly efficient signal detection algorithm in the context of CDMA; while in the 
, critical regime one observes a first order transition line that ends in a continuous phase transition point. 

■ Finite size effects are also observed. While the IRSB ansatz is not required for the original problems, 
CN ■ it was applied to the CDMA signal detection problem with a more complex noise model that exhibits 
f — RSB behaviour, resulting in an improvement in performance. 

o , 
l> . 
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I. INTRODUCTION 



Efficient inference in large complex systems is a major challenge with significant implications in 
science, engineering and computing. Exact inference is computationally hard in complex systems 
and a range of approximation methods have been devised over the years, many of which have 
been originated in the physics literature 1|. A recent review ^ highlights the links between the 
various approximation methods and their applications. 

Approximative Bayesian inference techniques arguably offer the most principled approach to in- 
formation extraction, by combining a rigorous statistical approach with a feasible but systematic 
approximation. Although message passing techniques have existed for some time in the computer 
science community ^,0] they have enjoyed growing popularity in recent years [sl, mainly within 
the context of Bayesian networks and the use of Belief Propagation (BP) for a range of inference 
applications, from signal extraction in telecommunication to machine learning. 

The main advantage of these techniques is their moderate growth in computational cost, with 
respect to the systems size, due to the local nature of the calculation when applied to sparse 
graphs. Until recently, message passing techniques were deemed unsuitable for inference in 
densely connected systems due to the inherently high number of short loops in the corresponding 
graphical representation, and the large number of connections per node, which results in a high 
computational cost. Both properties are considered prohibitive to the use of conventional message 
passing techniques in such problems. 

A recently suggested method for message passing in densely connected systems ^ relies on 
replacing individual messages by averages sampled from a Gaussian distribution of some mean 
and variance that are modified iteratively. The method has been applied for the CDMA signal 
detection inference problem; it successfully finds optimal solutions where the space of solutions 
is contiguous but breaks down when the solution space becomes fragmented, for instance, when 
there is a mismatch between the true and assumed noise levels in the CDMA detection problem. 
The emergence of competing solutions gives rise to conflicting messages that result in bungled 
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average messages and suboptimal performance. In statistical physics terms, it corresponds to 
the rephca symmetric solution in dense systems {3| and gives poor estimates when more complex 
solution structures are required. _ 

In the current paper, we methodologically extend the approach of Kabashima [6|| for inference 
in dense graphs by considering a large (infinite) number of replicated variable systems, exposed 
to the same evidential data (received signals). Each one of the systems represents a pure state 
and a possible solution. The pseudo posteriors, that form the basis for our estimates, are based 
on averages over the replicated systen,s. The method ha. been en,ployed previously only in 
the non-critical regime [8j|, using the most basic (RS-like) ansatz for the solution structure. 
In the current paper we study both critical and non-critical regimes and extend the solution 
structure considered to include step replica symmetry breaking (IRSB) like structures Q]. To 
demonstrate the potential of this approach and the performance obtained using the resulting 
algorithm we apply the method to two different but related problems: signal detection in Code 
Division Multiple Access (CDMA) and learning in the Ising linear perceptron (ILP). 

We investigate both RS and IRSB-like structures. The former is applied to both CDMA and ILP 
problems and seems to be sufficient for obtaining optimal performances; the latter is applied to 
a variant of the CDMA signal detection problem with a more complex noise model that exhibits 
RSB-like behaviour, to demonstrate its efficacy for particularly difficult inference tasks. 

In section [Til we will introduce the general models studied, followed by a brief review of message 
passing techniques for dense systems in section lllli The general derivation of our approach, for 
both RS and RSB-like solution structures, will be presented in section HVl numerical studies of 
both CDMA signal detection and ILP learning will be reported in section |Vl To demonstrate 
the method based on the more complex IRSB solution structure, and to examine its efficacy to 
problems that require such structures, we will introduce a variant of the CDMA signal detection 
problem and study it numerically in section |VTl We will conclude the presentation with a 
summary and point to future research directions. Details of the derivation will be provided in 
Appendices lAllEl 
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II. MODELS STUDIED 



Before describing the inference method, the approach taken and the algorithms derived from it, 
it would be helpful to briefly describe the exemplar inference problems tackled in this paper. 

We apply the method to two different but related inference problems: signal detection in CDMA 
and learning in the Ising linear perceptron (ILP). Both correspond to inference problems where 
data points are noisy representations of sums of binary variables modulated by random binary 
values. 

Multiple access communication refers to the transmission of multiple messages to a single re- 
ceiver. The scenario we study here, described schematically in figure [I](a), is that of K users 
transmitting independent messages over an additive white Gaussian noise (AWGN) channel of 
zero mean and variance a^. Various methods are in place for separating the messages, in partic- 
ular Time, Frequency and Code Division Multiple Access 1^. The latter, is based on spreading 
the signal by using K individual random binary spreading codes of spreading factor A^. We con- 
sider the large-system limit, in which the number of users K tends to infinity while the system 
load [3 = K/N is kept to be 0{1). We focus on a CDMA system using binary phase shift keying 
(BPSK) symbols and will assume the power is completely controlled to unit energy. The received 
aggregated, modulated and corrupted signal is of the form: 



where hk is the bit transmitted by user fc, s^k is the spreading chip value, is the Gaussian 
noise variable drawn from Af (0, 1), and the received message. The task is to infer the original 
transmission from the set of received messages. This process is reminiscent of the learning task 
performed by a perceptron with binary weights and linear output, which is the next example 
considered in this paper. 

Learning in neural networks has attracted considerable theoretical interest. In particular we 
focus on supervised learning from examples, which relies on a training set consisting of examples 
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Figure 1: Schematic representation of (a) the CDMA system, (b) the ILP. 



of the target task 



We consider a perceptron, described schematically in figure [T](b), which 



is a network that sums a single layer of inputs Sf^k with synaptic weights bk and passes the result 
through a transfer function 

where g is typically a non-linear sigmoidal function. If g{x) = x the network is termed linear 
output perceptron. If the weights € {±1} the network is called Ising perceptron. Learning is a 
search through the weight space for the perceptron that best approximates a target rule. 
The similarity between the linear perceptron of equation ([2]) and the CDMA detection problem 
of Eq.([T]) allows for a direct relation between the two problems to be established. The main 
difference between the problems is the regime of interest. While CDMA detection applications 
are of interest mainly for non-critical low load values, ILP studies focused on the critical regime. 
We consider both regimes in this paper. 
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III. MESSAGE PASSING FOR INFERENCE IN DENSELY CONNECTED SYSTEMS 



Graphical models (Bayes belief networks) provide a powerful framework for modelling statistical 
dependencies between variables 0, 0, They play an essential role in devising a principled 
probabilistic framework for inference in a broad range of applications. 

Message passing techniques are typically used for inference in graphical models that can be 
represented by a sparse graph with a few (typically long) loops. They are aimed at obtaining 
(pseudo) posterior estimates for the system's variables by iteratively passing messages (locally 
calculated conditional probabilities) between variables. Iterative message passing of this type is 
guaranteed to converge to the globally correct estimate when the system is tree-like; there are 
no such guarantees for systems with loops even in the case of large loops and a local tree-like 
structure (although message passirig_^techniques have been used successfully in loopy systems, 
supported by some limited theory [12|). A clear link has been established 



sage passing al gori thms and well known methods of statistical mechanics 
approximation [isl . \l\ . 



aetween certain mes- 
2fl such as the Bethe 



These inherent limitations seem to prevent the use of message passing techniques in densely con- 
nected systems due to their high connectivity, implying an exponentially growing cost, and an 
exponential number of loops. However, an exciting new approach has been recently suggested 
for extending BP techniques to densely connected systems. In this approach, messages 

are grouped together, giving rise to a macroscopic random variable, drawn from a Gaussian dis- 
tribution of varying mean and variance for each of the nodes. The technique has been successfully 
applied to CDMA signal detection problems and the results reported are competitive with those 
of other state-of-the-art techniques. However, the current approach has some inherent limita- 
tions presumably due to its similarity to the replica symmetric solution in the equivalent 
Ising spin models jj, [31- 

n . 

In a separate recent development [15|, the replica-symmetric-equivalent BP has been extended 
to Survey Propagation (SP), which corresponds to one-step replica symmetry breaking in di- 
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luted systems. This new algorithm, motivated by the theoretical physics interpretation of such 
problems, has been highly successful in solving hard computational problems [l^, far beyond 
other existing approaches. In addition, the algorithm facilitated theoretical studies of the corre- 



sponding physical system and contributed to our understanding of i_t 
recently been modified to handle Ising and multilayer perceptrons 
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16|. The SP algorithm has 



IV. GENERAL FORMALISM 



E 



We recently presented a new approach [8l| for inference in densely connected systems, which 
was inspired by both the extension of BP to densely connected graphs and the introduction of 
SP. The systems we consider here are characterised by multiplicity of pure states and a possible 
fragmentation of the space of solutions. To address the inference problem in such cases we 
consider an ensemble of replicated systems where averages are taken over the ensemble of potential 
solutions. This amounts to the presentation of a new graph, where the observables are linked 
to variables in all replicated systems, namely B = (b^, b^, . . . , b"); where b^ = {h\, &|, . . . , fe^)^, as 
shown in figured To estimate the variables B given the data = ?/2, • • • , Z/Af), in a Bayesian 
framework, we have to maximise the posterior P (B|y) oc n^=i P (z/ajIB) P (B) , where we have 
considered independent data, and thus P (y|B) = ]^^^-|^ P (y^|B). 

The likelihood so defined is of a general form; the explicit expression depends on the particular 
problem studied. Here, we are interested in cases where b G {±1}'^ is an unbiased vector and 
P (B) = 2"^". The estimate we would like to obtain is the maximiser of the posterior marginal 
(MPM) hk = argmaxj^^g|_^]^|n j P (B|y) , which is expected to be a vector with equal 

entries for all replica b\ = b\ = ■ ■ ■ = hi. The number of operations required to obtain the full 
MPM estimator is of O (2^) which is infeasible for large K values. 

To obtain an approximate MPM estimate we apply BP message passing technique 0, Q, Q|. In 
particular we are interested here in the application of BP to densely connected graphs, similar 
to the one presented in Q|. The latter is based on estimating a single solution and therefore does 
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Figure 2: Replicated solutions B = (bi, b2, -jb^) given data. 



not converge, as has been observed, when the solution space becomes fragmented and multiple 
solutions emerge. This arguably corresponds to the replica symmetry breaking phenomena and 
occurs, for instance, when the noise level is unknown in the CDMA signal detection case. 
A potential algorithmic improvement is achieved by the introduction of an SP-like approach, 
based on replicated variable systems, similar to the approach taken in problems that can be 
mapped onto sparsely connected graphs. 

Using Bayes rule one straightforwardly obtains the BP equations: 



For calculating the posterior P(y|B) , we assume a dependency of the data on the parameters 
of the form = (^^i £ ^iihf, . where JF is some general smooth function, 7 are model 
parameters and e^i are small enough to ensure that Yld=i^iJ.i^ ~ ^(1)- We define the vector 
= Si=i ^ni^i = Hi^k ^t^i^i + ^Mfebfe = A^fe + £^febfe. Thus, using y^^J^ (A^^ + e^kW,j) we 




(3) 
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can model the likelihood such that 



P(b„|B) = /'dA„iP(;/„,A„i|B;7) 

^ /dA^Pfe,|A^,B;,)P(A^|B) 

= y"dA^fcP(l/^|A^fc + £^fcbfc;7)P(A^fc|B) 

~ J dA^k [l + £;.febJVA,,lnP(y^|A^fe;7)] P(y^|A^ifc;7) P(A^fe|B) , (5) 

where we have assumed that P {y^\A^k, B;j)^P (i/^l A^^ + ^luk^k, l), 
due to the assumed dependence of the observed values on and b^. 

A. Inter-replica correlations 

An explicit expression for inter-dependence between solutions is required for obtaining a closed 
set of update equations. We assume a dependence of the form 



P' (bfe \{y.^,} ) oc exp jh^ b, + ^bjQ*^, b,| , 



(6) 



where h^^^ is a vector representing an external field and Q^^, the matrix of cross-replica interaction. 
The form of Q^^ depends upon the particular case considered. We assume one of the following 
symmetry relation between the replicated solutions: 

(h*.)'' = Kk^ and 
r^QUr' -S-'ql,+ {l-S-')ql, or 

where ^ is a block index that runs from 1 to L and 'a' is a intra-block replica index that runs 
form 1 to n where n is the number of variables per block. We also make the following reasonable 
assumption ^q^^ > gj^^ > ^2/^*: > 0, as one expects correlations to gradually decrease between 
variables with non-identical replica and block indices, respectively. 
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For both types of symmetries considered, the correlation matrix defined as: 



a' \ 



where I is an index or a pair of indices for RS and IRSB, respectively. The correlation matrix 
is assumed to be self-averaging, i.e. Y^^/^ ~ T* and preserves the symmetry of the matrix Q^;,. 
An explicit derivation of the entries of T* is presented in Appendices [A] and [Bl for the RS and 
RSB-like correlation structures, respectively; the matrices take following the general form: 



^(IRSB) yt^ 
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X' + 1 - b"^ -V 



n 



Thus, for the appropriate centre of the distribution u|^^ (see equations (IA8I) and (IBlSjl ). the 
probability of A^^ can be expressed as: 



^(A^felB) 



(27r)" dct (T* 



■ exp 



J d-d exp 



cx < 



-n 



2i?* 

n 
~2 



U2=i exp < 



2X* 



+ 



U2=i exp < 



(41 - K'( 



for the RS and RSB-like correlation matrices, respectively, where 7?°f = + -(9^ + and 



B. Messages 

Having obtained the conditional probability distribution P (A^^IB) one can then derive explicit 
expressions for the messages m^^ (magnetisation) and fh^k that can be viewed as parameters 
in the corresponding marginalised binary distributions {yfi\bk, {yuj^fi}) oc (1 + rh}^J}k) 1 2 and 
P' (&fclW}) = (l + <A)/2. 
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The messages from nodes to nodes b^, as derived in Appendix ICl equations (IClll - (IC8 



m 



t+i 

Ilk 



U 



Ilk 



R 



(RS) 
(RSB) 



(8) 



where Vj = , V is defined in equation ( 1C3I1 and -i?^^ is obtained from the saddle point 

equations given by equation (IDll) in the RS case and by equation (ID2p in the IRSB case. The 
messages from nodes to are given in both cases by the expression m^^ ^ tanh (^^^^ ^^kj ■ 
For the gauged field bkh'^^f^ where h^^^ = artanh (m^J = E^^;. artanh (ml^ J ~ T^u^ii^lk- The 
distribution of this field is well approximated by a Gaussian as a result of the central limit 
theorem. The mean and variance of the Gaussian are and F* respectively: 



^ K N 
k=l At=l 



(9) 



N 



k=l \ k=l / 



K N 



k=i fi=i 



Both E^ and F* are assumed to be independent of the index /i by virtue of the self-averaging prop- 
erty. For the same reason we expect the macroscopic variables defined as M* = Ylk=i ^fc^^^i^/-^ — 
Y.k=i^km\/K = M* and iV* ^ Ef=i K.)V^ ^ Ef=i K) = N\ where m* ^ 
tanh {^^^i ^ikj 1 to be independent of the index /i. Thus, these macroscopic variables can 
be evaluated by the following integrals 



Vu tanh VF*m + F' 



AT* 



Dm tanh 



+ E' 



where Vu = exp (— /a/^tt 
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C. Optimisation 



The structure of the correlation matrix used introduces free variables in the form of the correla- 
tion terms between replicated solutions. These are used for optimising the estimation provided 
with respect to a given performance measure. 

Since the MPM estimator is given by hi = sgn (m|,) ~ sgn (m^^) = sgn (/i^^) , the expression for 
the error per bit rate takes the form: 

1 ^ 

^^^ = ^E(i-^g^(^^"^U) ' (10) 

k=l 

which is minimised when the true message vector b and the vector of messages m* are parallel. 
Therefore, the error rate per bit decreases as the ratio M*/v^iV* = cos ^bm*j increases. The 



OE^ 1 dF^ 

optimal value is reached when E^ (7^^) = F* (7^^) and 
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as derived in 



CDMA AND LINEAR ISING PERCEPTRON 



Using this notation one defines e^k = s^k/vN for the CDMA problem and e^k = s^k/vK for 
the Ising perceptron. The goal is to get an accurate estimate of the vector b for all users given the 
received message vector y via a principled approximation of the posterior P(b|y). An expression 
representing the likelihood is required and is easily derived from the noise model (assuming zero 
mean and variance cr^). If the arithmetic variance over replicas of the macroscopic message A^f. 
is finite and independent of the sub indexes and k, i.e. = ^ J2s, (^/ife)^ ~ in X^a^/lfc)^ ^ 
00 V/x/c, then P(y^|B) can be expanded as 



P - \/ exp <^ ^ ^-^^ ^ ;> 1 + -^b, (y^ - ^^k> 



a' 



fill 
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nL 

, ^ s 

where = y^u and =(1, 1, ■ ■ ■ , 1). The function V {§,y^, defined in equation f lC4l) . and 
obtained from this distribution is linear in therefore, the second derivative used for calculating 
the messages in equation ([8]) = and the corresponding structure of the correlation matrix is 
RS-like. 

To calculate correlations between replica we expand P(?/^|B) in the large N limit in (fTTl) . as 
shown in equation ([5]). According to the RS correlation assumption, the macroscopic variables 
satisfy the following relation: 



X* ~ 62 (1 - X*) 



where ei = I {13) for the CDMA (ILP) system and 62 = (3 {I) for the CDMA (ILP) systems, 
respectively, due to the change in scaling. The saddle point equation (1C6I 1 provides a dominant 
value for the variable 

A. Messages 

The message from to bl at time t + 1 is then given by: 



The main difference between equation (IT2|) and the equivalent equation in [6| is the dependence 
of the pre- factor on i?*, reflecting correlations between different solutions groups (replica). To 
determine this term we optimise the choice of cr^ by applying the condition = F^. Forcing this 
condition leads to a relation between the structure of the space of solutions, represented by i?*, 
and the free parameter of the model cr^. From equation (fT2l) and using = F* and M* = X* 
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one obtains: 

which imply, after simplification, that for both cases = — a'^. Despite the simplicity of this 
result, the process from which we obtained it provides us with a practical way to estimate the true 
noise variance. Notice that for calculating and F* we used the limits K,N ^ cxd with K/N = 
j3. So that ctq, which appears in the expression for F*, can be obtained from the signal vector of 
Un with an infinite number of entries. Thus 



Using this expression we can finally express the message as: 



- ^i^k ^ , (13) 



N 



where no prior belief of a is required. 



B. Steady state and critical analysis 



The steady state equations for the macroscopic variables and E* are obtained by taken the 
limit t ^ oo. Let us define N = limt^^N^ and E = lim^^ooF*. In the asymptotic regime the 
following relations hold: 

N {al P) ^ jvu tanh^ (^^E {al(3)u + E {al (5)^ (14) 
and from these expressions one can obtain the full expression for the error per bit rate: 
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(15) 
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Figure 3: (a) Error probability of the inferred solution evolving in time. The system load (3 = 0.25, 
true noise level ctq — ^'•25 and estimated noise cj^ = 0.01. Squares represent results of the original 
algorithm ^, solid line the dynamics obtained from our equations; circles represent results obtained 
from the suggested practical algorithm. Variances are smaller than the symbol size, (b) a measure 
of convergence for the obtained solutions, as a function of time; symbols are as in the main figure. 

C. CDMA signal detection - numerical results 



The inference algorithm requires an iterative update of equations (IC9|13p and converges to 
a reliable estimate of the signal, with no need for prior information of the noise level. The 
computational complexity of the algorithm is of Oi^K"^). 

To test the performance of our algorithm we carried out a set of experiments of the CDMA 
signal detection problem under typical conditions. Error probability of the inferred signals was 
calculated for a system load of /5 = 0.25, where the true noise level is ctq = 0.25 and the estimated 
noise is (T^ = 0.01, as shown in figure[3](a). The solid line represents the expected theoretical results 
(density evolution), knowing the exact values of and a^, while circles represent simulation 
results obtained via the suggested practical algorithm, where no such knowledge is assumed. 
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The results presented are based on 10^ trials per point and a system size = 2000 and are 
superior to those obtained using the original algorithm 
Another performance measure one should consider is 

that provides an indication to the stability of the solutions obtained. In figure [3](b) we see that 
results obtained from our algorithm show convergence to a reliable solution in contrast to the 
original algorithm [6||. The physical interpretation of the difference between the two results is 
assumed to be related to a replica symmetry breaking phenomenon. 



D. Ising linear perceptron - numerical results 

For the ILP, the K > N regime of high interest as the system develops a critical behaviour for a 
range of cTq values. We carried out a set of experiments for this system based on density evolution. 
In figure [4](a) we present curves of the bit error probability Pb, defined in equation (fTSl) . as a 
function of the inverse load for different values of cTq. Three different regimes have been 
observed: For ctq < 0.1025 the curves exhibit a discontinuity at a value of P that varies with ctq 
(first order phase transition-like behaviour). At cTq = 0.1025 the curve becomes continuous but 
its slope diverges (second order phase transition-like behaviour). The Pb curves show analytical 
behaviour for noise values above 0.1025. Figure[4](b) exhibits a phase diagram of the ILP system; 
it shows the dependency of the critical load /?(^^ as a function of the noise parameter. The 
first order transition line ends in a second order transition point marked by a circle. The results 
obtained, and in particular the critical /3 value, are consistent with those derived using the replica 
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symmetric statistical mechanics-based analysis of the problem 
Another indication for the critical behaviour is the number of steps required for the recursive 
update of equation (|T4l) to convergence. In figure [5](a) we present the number of iterations 
required to reach a steady state as a function of j3~^ when the noise parameter is set to ctq = 0.1. 

16 



Figure 4: (a) The error probability Pb at the steady state, equation (fTSjl . as a function of for different 
values of the noise parameter. For values of below 0.1025 the curves show discontinuity at certain 
P values, which becomes continuous but non- analytic at erg = 0.1025 around /S^^ ~ 0.68. For noise 
variance values above = 0.1025 the curves become analytical, (b) Position of the non analyticity of 
the error rate curve as a function of the noise parameter cjg. This first order phase transition-like 
curve ends in a second order phase transition-like point marked by (o). 

The number of iterations diverges when the critical value of P is reached. 

Finally, we wish to explore the efficiency of the algorithm as a function of the system size. 
In figure [5](b) we present the result of iterating equations fICQl) and (fT3l) for a system size of 
if =500. The curve presents mean values and error bars over 1000 experiments. There is a strong 
dependency of the error per bit rate on the size of the system, which is expected to converge to 
the asymptotic limit (infinite system size) represented by the solid line. 
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Figure 5: (a) Number of iterations of equation (fT4l) required for convergence as a function of /3, for 
ctq = 0.10; one clearly identifies the /3 value where the error rate curve exhibits a discontinuity, (b) 
Finite size effects are observed at all /3 values. The noise level used is Uq = 0.10 with K = 500. The 
curves provide mean values and error-bars over 1000 experiments. The solid curve obtained from the 
iteration of the steady state equations is presented as a reference. 

VI. CDMA SIGNAL DETECTION WITH DUAL-PEAKED GAUSSIAN NOISE 



To demonstrate the suitability of the method for more complex inference problems that require 
a system with IRSB-like structures, we will consider the CDMA signal of equation ([T]) where the 



noise is drawn from a bi-Gaussian distribution: 



P M = 7^ exp 



1+ro 1 



2tt 



exp 



(n^ - eo/cToY 



(16) 



where tq G (—1,1) represents the bias and ±£:o/o"o the positions of the Gaussian peaks. We 
consider the particular case where |£o/o"o| ^ so that the Gaussian peaks are slightly off centre. 
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For this model the hkelihood expression takes the form: 



1=1 a=l 



2(t2 



1 + r 
H — exp 



2a2 



where r, e and are estimates of the true parameters tq, and Uq. 

To derive the messages in this case we first calculate the function V {'d,y^) of equation (IC4I) . 
which has the form: 



■ tanh ( e-^^ 



+ arctanh(r^ 



where X* = /5 (1 - A^*) . 

Following the derivation of Appendix [Cj the saddle point equations f IDll) and ( 1D2I) can be 

expressed as: 



Ilk 



a2 + X* 



tanh I + arctanh(r 



a2 + X 



(t2 + X* 



tanh e 



+ X* + ly* (t2 ^ X* a2 + X* + ly* \ a2 + X 
Pw {yfM - ""^fc) + ^ (Po - Pw^) tanh (ez + arctanh(r)) 
zo + r Apw e+ {l- r^) Apw z 



- + arctanh(r) 



1 



-r (1 - r^) ApvK ^^^^ - o (1 - (1 - 3^^) ^P^/ 

where we denote M/* = i?* for the RS case and = 2V* - for the IRSB case, z = 
Pa = ((j2 + X* + 2o = pw {yfi - u^fik) and Apw = po - pw- 

The solution of this equation provides, up to order O (e^), 

1 



(e) c:^ zo + r Apw e + {l - r^) Apw 



zoe^ + r {Apw - ^o) + (l - Sr^) zq Ap 



19 



The function V and its two first derivatives at the saddle point value are: 

Po = -r [1 + (1 - r^) ApH/ e^] pw e + 

+ [l-[l-r^)pl,e^-[l- r^) (l - Sr^) Ap^/ plr e'] {y, - + 
+r (1 - r^) p3, - ul,)'e' + r^) (l - 3r^) p^. (y, - 

Pi ~ -Po + 

P2 = 2p3 (1 - r^) [re^ + (i _ Sr^) p^^ (y^ - u'^,) e'] , 

therefore, one can obtain the following expression, required for calculating the messages in the 
IRSB case (El 



1 VoV 



(1 - r^) poApy [re' + (l - 3r') pw {y, - u^,) e'] 



2 1 - ViV 

where Apy = — py. This straightforwardly leads to the following expression for the message: 
^^^^^^-S^ = ^{-[pw+{l- r') (r„ - p^) e^] re+ 



+PW [1 - (1 - r') PW s'-{l- r') (1 - 3r') (T, - p^,) e'] {y, - + 
+r (1 - r^) p3,.3 _ ^t^^Y + 1 (1 _ ,2) (1 _ 3^2) ^4 _ ^*^)3| ^ ^^^^ 

where T„ = po (piy — \^Pv)- The expression for the message in the RS case is recovered from 
equation (flTll in the limit ^ oo. 



A. Optimisation and messages 

Calculating the expressions for the macroscopic variables E*^^ and used in the optimisation 

process, requires performing the following sums, in the limit of i^, ^ oo with K/N = [3 < oo: 

N K , 



k=l 
fi k=l 
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where j = 0, . . . , 3 and I = 0, ... ,4. From the definition of the signal ^ and the expression 
for the noise (HI) we find that Aq = 0, Ai = 1, A2 = 2Bi, A3 = 3B2, Bq = 1, Bi = tqEq, 
B2 = /? (1 - 2M* + iV*) + ^2 + el, B-i = Bi {W2 - 2el) and B^ = 35| - 2e^. The explicit 
expressions derived for the macroscopic variables are: 

E'+' = Pw-{l-r^) Pw + 2r (1 - r^) B.p^, - (l - r^) (l - 3r^) [r„ - (1 + ^ap,) p^] pw e' 
= S2p'^-2r5ip2^£ 

+ [r^ - 2 (1 - r^) Sspiy] p^ - 2r (l - r^) 5i [r„ - (2 + ?>B2Pw) Pw] Pw + 
+ (1 - r^) [2r2 (r„ - pI.) pw+{l- Sr^) ^2 {Spl^ + 252p^ - 2r„) p^] . 

Applying the optimisation conditions of Appendix [El (7'^) = F* (7^^) and 
= 0, where 7^ = (r, £,0"^, ^) one obtain the following conditions: 

7f 



2 F* 97. 



re = Bi + r(l-r^) ^° ^ ^ " . (19) 

In the IRSB case one can further simplify these expressions by a suitable choice of and the 
number of replicas per block n. Optimisation with respect to the latter results in 

1 = 52P0 (^1 - Ib2Apv^ , (20) 

which implies 

(X* + a')' {al - a') 



that by definition is larger than zero. This condition is satisfied if our estimate for the noise 
variance is smaller than the true parameter (a^ < a^). In this case the number of replicas per 
block has to satisfy the condition 



n 



l<n<f{X';ala^) 



- (X^ + a^) (ai - a^y 
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Interestingly this ties the noise level mismatch to the number of replicas, thus giving further 
insight to the role played by the structure of the inter-replica correlation matrix. 
For < X*, the minimum value of / (X*; cig, cr^) is reached at X^in = max (0, cTq — 2cr^). It is 
also possible to prove that 4 < / [Xmin] o"o' '^^) • Although and n will not be explicitly used in 
the following expressions, the correct choice of the value for these parameters allows one to use 
equations (fTSl) and (ITOll in order to find the final expression for the macroscopic variable i?*"*"^, 
where no estimates are needed for the noise parameters: 

(lRSB)^t+l _ 1 

B2 — Bf 

Note that in the RS case we do not have the freedom to choose the number of replicas per 
block, given that this case is equivalent to take n — > cxd in the absence of the additional replica 
I = 1, . . . , L. For this reason equations ( ITSl l and ( ITOll and (fT9ll take the form: 

= b;^bi-bi^^'-'^ ^^'^ 

re = Bi+r{l-r^) lz^^3^ (22) 
and the macroscopic variable 

(RS) pt+1 _ (IRSB) pt+l I 2-B^ (g^ - Bf) / B2 _ 

^ Bl \X^ + a^ 
which depends on both estimates of the noise variance and bias e. 

Given that the algorithm deals with finite signal vectors {N < 00), the quantities Bi and B2 
have to be approximated by the correspondent finite sums. Therefore, we have: 

1^1^ 1 ^ _ 

= ,^^Nj:Kj:iy^—u)^Nj:y^^Bi (23) 

At=l k=l fJ.=l 



N K N 

At=l fc=l fi=l 



where we used the fact that limiv /^^oo J2fi. k ^a^^ ~ ^' Observe that no information about the 
true noise has been used to derive these expressions. 
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Having the estimates (l23l) we can write down the messages explicitly: 



1 ^2 \B2 b; 

Bi{ e'-Bl) ^ ^^^^(e^-Bl){e^-3Bl 



{y^^ - u^kY + o =4 {y^^ - ^Uf 



B^ 



2 r 2 

B, + 2^^^ (y, - ul,) 

'2 1 B2 

which can be now used recursively for obtaining the inferred solutions for this problem. Notice 
that an estimate of both e and a in required in the RS case. 



B. Numerical results 



To test the performance of the IRSB algorithm we carried out a set of experiments of the 
CDMA signal detection problem with bi-Gaussian noise. The results shown in figure[6](a) describe 
the error probability of the inferred signals as a function of the number of iterations has been 
calculated using both RS and IRSB-like correlation matrices for the case of parameters mismatch. 
The system load used in the simulations was /5 = 0.25, the true noise level ctq =0.25, Gaussian bias 
of £0 = 0.06 and weight tq =0.6. The estimated noise parameters are ct^ = 0.01 and e = 0.2. The 
circles represent simulation results obtained via the IRSB algorithm while the squares correspond 
to the RS-like structure. The results presented are based on 10^ trials per point and a system size 
A^ = 1000; error-bars are also provided. The results obtained using the IRSB-like structure are 
superior to those obtained using the RS algorithm. As shown in figure [6](b) using the stability 
measure D*, both RS and IRSB-based algorithms converge to reliable solutions; the IRSB-based 
algorithm is slightly slower to converge, presumably due to the more complex message passing 
scheme. 
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Figure 6: (a) Error probability of the inferred solution evolving in time, for the bi-Gaussian noise case. 
The system load /? = 0.25, true noise level ctq = 0.25 and estimated noise = 0.01. Squares represent 
results of the RS algorithm and circles represent results obtained from the IRSB algorithm, (b) D*, 
a measure of convergence in the obtained solutions, as a function of time; symbols are as in the main 
figure. 

VII. CONCLUSIONS 



We present and methodologically develop a nev^^ algorithm for using BP in densely connected 
systems that enables one to obtain reliable solutions even when the solution space is fragmented. 
The algorithm relies on the introduction of a large number of replicated variable systems exposed 
to the same evidential nodes. Messages are obtained by averaging over all replicated systems 
leading to pseudoposterior that is then used to infer the variable nodes most probable values. 
This is done with no actual replication, by introducing an assumption about correlations between 
the replicated variables and exploiting the high number of replicated systems. The algorithm was 
developed in a systematic manner to accommodate more complex correlation matrices. It was 
successfully applied to the CDMA signal detection and ILP learning problems, using the RS-like 
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correlation matrix, and to the CDMA inference problem with bi-modal Gaussian noise model 
in the IRSB-like correlation matrix. The algorithm provides superior results to other existing 

n n 

algorithms [Q, 118| and a systematic improvement where more complex correlation matrices are 
introduced, where required. 

Further research is required to fully determine the potential of the new algorithm. Two particular 
areas which we consider as particularly promising are inference problems characterised by discrete 
data variables and noise model and problems that can be mapped onto sparse graphs. Both 
activities are currently underway. 
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Appendix A: THE REPLICA SYMMETRIC (RS) ANSATZ 

Within the RS setting, the interaction term in equation ((61) is: 




A simplified expression for equation (JG]) immediately follows 




a=l \a=l 




-'-oo ^yi^fc a=l j 

where Z^^, is a normalisation constant. The diagonal elements c^^^^, only affect the normalisation 
term and can therefore be taken to zero with no loss of generality. 
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We expect the logarithm of the normahsation term Z*^ (linked to the free energy), obtained 
from the well behaved distribution P*, to be self-averaging. We therefore expect 



lim 



lim 



n— >oo n \ / n— >oo fl 

where h and qi are the mean values of the parameters drawn for some suitable distributions and 
the over-line represents the mean value of the partition function over these distributions. 
In the following we will drop the upper-index t and the sub-indices /i and k for brevity. To obtain 
the scaling behaviour of the various parameters one calculates Z {h, qi) explicitly, assuming the 
parameter qi is taken from a normal distribution A''(^i,ct^). The partition function takes the 
form : 



Z{h,qi) 



f 



exp 



{x - hy 



-|- nln (2 cosh(x)) ) . 



V27r?i \ 2qi 
Thus, the mean value of the partition function over the set of parameters is: 

Z{h,qi) = j Vg^Z{h,qi), 

where Vg-^ = dqiJ\f (gi,o'gJ . The normalisation can be expressed as: 



(Al) 



Z{h,q,) 



n 

\ ^ / n 




2a 



n 



exp < n 



2a 



h 1-—]+^ 1-—] n + — 



a 



\h\ 



n 



^1 , 3^91 

"2 



?1 , 



2a 



n 



n 



ln(2) + I /i| +n-^ + n' ^ 
z o 

where A{ri) ~ 0{1). Thus, /i ~ C (1), gi ~ O (^~^) and a^^ ~ O {n~^). >From now on we will 
take the off-diagonal elements of the RS matrix Q^^. equal to g\^k/n, where g\^^, ~ C (1). 
The form of the marginalised posterior at time t is then: 

- -r X 

a=l 



J — t 



dx exp < —n- 



2g' 



+xJ2k 



/oo 
dx exp {-n^ {x; hl^, g\^,k) } 
-oo 



(A2) 
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Figure 7: Solutions for the mean field equation (IA3P with two maxima and one minimum for a positive 
value of the field h. 



where 



In (2 cosh(x)) . 



The function <P (x; h,gi) presents one or two minima according to the following table: 



h 


91 


Number of minima 




0<gi<l 


one min. 


\h\ = he 


91 > 1 


one min. and one hump 


\h\ < he 


91 > 1 


two min. 



where he = ^/ 9i{9i — 1) — cosh^^ (v^); the coefficient gi plays the role of the inverse tempera- 
ture. Below the critical value gie = 1 a spontaneous magnetisation appears. 
This results from analysing the equation: 

d<P {x; h, gi) x — h 



dx gi 

The case of two maxima is presented in figure [7l 



tanh(x) = 0. 



(A3) 
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We define the mean values from the distribution equation (IA2I1 . If the field h is not zero, as 
shown in figure [3, [exp develops one dominant maximum as — oo. For large enough 

n, only this maximum contributes to the integrals (1A2I1 and the algorithm obtained from this 
assumption turns out to be the same as the one presented in 6|. However, if the field is 
sufficiently small it gives rise to a new regime where the two maxima contribute. At the same 
time, it is important to note that a small, non zero field favours the solution of Eq. (IA3l) that 
satisfies sgn(x) = sgYi{h). To analyse the behaviour of the field, we will explore the solutions of 
Eq. (lA3ll in the regime < <^ 1. With this aim, suppose that the solutions for the Eq. (lA3ll at 
zero field are Xq = ±gi \m\ where m = tanh (xq) and sgn(m) = sgn{h). If the field is sufficiently 
small one can expand the solutions of equation (1A3I) as x±h = ±gim + C,{m, gi)h where ^{m,gi)h 
is expected to be small and satisfies sgn (.^(m, gi)h) = sgn{h). Observe that if the field is positive 
(negative), both roots are displaced to the right (left) with respect to the zero field solutions. 
Using this expression for the roots in Eq. (lA3ll and disregarding terms of O {h?) one finds that 

gi) = ^ . (A4) 

1 - 5(1 (1 - m^) 

The expression for the exponent $ near the roots and in the < ^ 1 regime is then 
<P {x±h] h 0,gi) ~ ^ (a^o; 0, ^i) =F = ^oT i^h , and, by the definition of the m, the product 
mh is positively defined. 

Let us define (3±h {m, gi) = (1 — rn?) [1 =i= 2^ (m, gi) mh]. We expect that, for large n the following 
approximation to be valid: 

exp{-n^(x;/i^0,5i)} ^ e""'^" {e"™" exp [^i"' - (m, ^i)] (x-x,)'} 

+e-"'"'^exp{-^ [5r'-/5-h(m,5i)] (x - x_,)'} } . (A5) 
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Using equation (lASjl one can calculate the normalisation in equation (lAlll 

^ y 27r(7i^K^?i) ^-n.^>o I e"™'^ (i _ (i _ ^2^, ^2 ^^^^ 

+ e-"'"^(l + ^1 (1 - m^) (m, ^1) m/i) } . (A6) 

The mean value of a given function f{x) with respect to the conditional probability distribution 
defined in equation ( 1A2[ ) is then: 

f [Xh) + {X- Xh) f (Xh) + ^hf f" (Xh) 

_^^-i^.n{^,+mh) J^^ expj-^ [g^^ - {l-m^){l + 2^ (m, gi) mh)]{x-x.hy 

/ {x-h) + {x- x-h) f {x-h) + -{x- X_hf f" (X-h) 

which implies, considering that the integrals of the linear terms are zero and keeping only the 
leading terms in the expansions, that the expectation values takes the form: 

{f{x)\h ^ 0,g,) ^ [1 - (1 + 2e {m,g,) mh)] [f (x,) + /" (x,)} 

+e-^nmh^^^2e{m,g,) mh)f{x_h) • 

Considering the expansion of f {x±h) — f {±gim + {m, gi) h) ~ f {±gim) + 
C {i^ydi) f (ifi'i"^) h and disregarding terms of O (/le"^"'"'^) , one can write: 



{f{x)\h^O,g,) ^ / {mgi)+^^ {m,g^) f" (m(7i)-e-"[/ (mg,)- f i^g,)]+f' {mg,)i {m,g{) h. 

(A7) 
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The resulting one and two variable expectation values become 



{bfe} 



1 - 



n 



jj,k 



m 



h 



and 



where 

(tanh2(a;)|/i^fe ^ 0,^^) = (m^fc)'+e Hfe'^U) " iKk)' 
and 



Thus, the leading terms for the covariance matrix of the replicated variables are: 



ab 



If one requires the non-diagonal elements of this covariance matrix to have the same scaling as 

the inter-replica interaction matrix, the field has to behave in such a way that the exponential 

2n 



n 



term contributes at most in O {n ^) . One thus expects the field to obey ml^hlj^ < — In 

where the n^^, are appropriate constants. With this asymptotic behaviour, the expression for the 
entries in the covariance matrix is 

dlixki ij^lki 9lnk) 
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n 



1 - (Kk) 



which serves to define the probability distribution for the macroscopic variable = Xl^^fc ^/i«^f ■ 
As e^k and bl are unbiased variables, the variable A^)., by virtue of the central limit theorem, 
obeys a normal distribution, with mean value and covariance matrix given by (to highest order) 

{ri.y" ^ {A^,,Ai,) - {At > (At) = E n^* {yu^,})J2',ie,M - 



l^k 



where 



and 



l^k 



1 2 



(A9) 



l^k 

are macroscopic variables of (9(1). In particular, i?^;. is a free variable that can be used later on to 
optimise a given performance measure. This variables have the property of being self-averaging, 
therefore we can drop the sub-indices fi and k. 



Appendix B: THE ONE STEP REPLICA SYMMETRY BREAKING (IRSB) ANSATZ 

Under a solution correlation matrix that resembles the IRSB structure, the system comprises 
nL variables, where both the number of blocks L and the number of variables per block n are 
considered large. As before we are interested in the regime where L and n ^ oo. 

With this setting, the interaction term in equation ([6l) is now: 

L / n \2 / L n 

i=l \a=l / \£=1 a=l 
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/( 2 ^ 2 ^ n 



thus we have now L + 1 squared sums in the exponent that can be replaced by integrals: 

where Ag*^ = q\^j^ — ql^f. > and x'^ = {xq, Xi, . . . , xl)- Also here we expect the logarithm of 
the normalisation term (linked to the free energy) obtained from the well behaved distribution 
P* to be self-averaging, thus: 

£^ il^ ^ (^) = tL tL ^ (^^^ ^^'^^)) ' 

which is satisfied if the entries behave like gg/^fc ~ difik/''^^ Ag^^ ~ fi'i^fc/^; where g\^i^ 
and (/g/ifc ~ ^(1)- Using this new scaled parameters, the expression for the normalisation is 
= / dx exp { -nL$ (x; h^^f^, g{^,^, gl^f.) ] where 

^ (x; aU) = ^ + tYI ^ - t 5Z cosh (xo + + /i^,)] . 

yi^k i=\ y2^k £=1 

As before, we drop the indexes fi, k, and t for brevity. The critical points of the function 
<P (x; h, gi, (?2) satisfy the following set of equations: 

L 







1 


dxQ 
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" L 






( Xi 


dxi 




\92 



— tanh [xq + X£ + /i) ) = , 
which are satisfied for the following values: 



1 ^ 

(?2 ^ ^ ' 



5'2 V 92 



tanh (x* + ^x* + /i ) , (Bl) 



where x* = j J2e=i ^*e- The second equation in the set, equation f IBll l. has the same form for all 
£ = 1, . . . , L and in the small field regime it has at most three different solutions. From the three 
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possible solutions, one is a local maximum; of the other two, the one that has the same sign as 
h is dominant. Thus we can expect, for all i, x} = x*. This reduces the set of L + 1 equations 
to one 

— = tanh ( — X* + h] , 

92 \92 J 

where G = gi + g2. With the substitution u = {G/g2)x* the equation has the same form as 
equation ( 1A3I ). i.e. u = G tanh {u + h) . If one considers again the field h to be small, the 
solutions can be expressed as an expansion of the zero field solutions u±h — ±Gm + ^{m,G)h, 
where ^{m,G) is given by equation (IA4I) . and sgn(m) = sgn{h). Using these expansions the 
critical values are given by: Xq ~ gi [±m + G~^^{m, G)h] and x}^f^ ^ g2 [±.m + G~^^{m, G)h] 
for alH= 1,...,L. 

As in the RS case, the expansion of <P around the critical points in the small field regime is 
$ (x^;^; h ^ 0, gi, g2) — ^ (xq; 0, gi, (72) =F = T ^h. So the dominant solution is the one 
that shares the sign with the field. 

For a sufficiently large system with nL variables, one expects the following expansion to be valid: 

nL 



exp {-riL^ (x; h ^ 0, 51, 52)} ^ e""^'^" <^ e"^'"'^ exp 



2 

nL 



{^-^hfti<P,h (x-x^) 



B2) 



where ii.p,±h is the Hessian of ^ in x^^. 

Defining P±h = (1 — m^) {1 =F 2 [(.{m, G) + 1] mh}, the entries of the Hessian become 



dxl 



^ 9i ^ 



I3±h = Ol±h 



dxj 



1 



{92 ^ - P±h) 



-l±h 



dxodx£ 
dxtdxfi 



. 
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The corresponding characteristic equation is: 

L-l 

L J L 

The solutions for this equation, disregarding terms of O {L""^) and O (hL"^), are 



det (H^,±h - Al) = ( ^-f±h - A 



{a±h - A) ( t7±/i - a ) - 



0. 



Ao,±/i — Q;±fe + 



10: 



±h 



La±h 

Ao {1 ± 2 [e(m, 5i) - 1] [e(m, G) + 1] (l - m") mh) 



Ai,±/i = T 7±/i I 



(B3) 



~ Ai M ± 2 



((//;. j/i)^ [((//;. 

'i-[eK5i)-i][eK^2)-i] 



[C(m, G) + 1] (1 - m^) m/i 



A^,±/i — ^7±/i 



~ A,{l±2[e(m,y2)-l][eKG') + l](l-m2) m/i} V£ = 2,...,L, 
where \q = + j Xi = ^7o — and A^ = 2;7o. The corresponding eigenvectors, up to 



order L ^, are: 



Uo,±/i 



L times 



1 



1 A 



^e,±h — 



1 / /3i 



L times 



(B4) 



-1 times 



L— ^ times 



0,1,1,..., !,-(£-!), 0,0,. ..,0 ye=2,...,L. 



These vectors satisfy the normalisation condition uJ^f^U£i^±h = 6^^' [I + O {L ^)] V£, = 
0,1,..., L. The linear transformation from the canonical basis to the basis of eigenvectors is 
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then represented by a matrix with the entries 

1 



(U±.) 



(Uo) 



i-1 



^(^ Lfe=i 
^oi — + (1 - 5oj) 

"0 



5^4^-(l-5o,)(l-5i,)5.,(j-l) 



- y ^oi (1 - Soi) — 
L ao 



(B5) 



ignoring terms of O [hL ^/^) . Because this transformation is a rigid rotation, the following 
properties are satisfied: |det (U±)| = 1 and U^^U±fe = U±/iU^;j = 1. 



Second order terms in equation (IB2I1 can be re-written using the diagonal rep- 
resentation of the Hessian. Therefore, keeping only terms of order O {L~^) we 
have that: (x - x±/j)"^ H<j,,±/j (x - x±/,) = (x - x±/j)"^ UoUjH<2i,±feUoUj (x - x±,,) = 
(y - y±hV ^'<p,±h (y - y±h), where y = Ujx and H'^^^h = UjH<2i,±/,Uo is the diagonal represen- 
tation of the Hessian, i.e. (H^ _|_^)^^. = 6ijXi^±h. Using the diagonal representation in conjunction 
with equation (IB2I) one obtains an expression for the normalisation term 



Z{h-^ 0,5(1,5(2) 



-riL^'Po—'mh) 



dy exp 



^g-nL(<Po+mh) y"dyexp 



^ (y - y-hYti'^^^h (y - y-h) 



L+l 

2 



\nL J 



^nLmh -Q ^ ^-nLmh "Q 



For a small field, the product of the eigenvalues can be approximated by 

L L 



.±h 



{1 T Um, 92) - 1] Urn, G) + 1] Lmh} \{ \ 



1=0 



Thus, the expression for Z reduces to 



Z{h^ 0,51,52) 



L + l r 
f27T\— 



nLmh 



\nL J 



{l-[am,g2)-l][^{m,G) + l]Lmh} 



+ e 



-nLmh 



{1 + [e(m,52) - 1] [eKG) + 1] Lmh}} 
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The mean value of a given function /(x) is then given by 

{m\h^0,g,,g2) ^ I dyexp|-^(y-y*)"^H^,,(y-y*)| 



/(x.) + ^(y-yD^H'^,,(y-yD 
/ dy exp {-^ (y - y\y K,-H (y - yl.) } 

where 'ii'f^±h the Hessian of the function / in the basis of eigenvectors of H^-t^, evaluated at the 
critical points. The linear terms in the expansion of / (x) do not contribute to the expectation 
value. The Gaussian integral of the cross products of the type (i/i — y*^±fi) [yj — y*j^±h) with i ^ j 
are zero, thus the Gaussian integral of the second term in the expansion of / (x) becomes: 

i± - lz-e-^(^o^-'^) j dyexp[-!^(y-yi,)^H:,,^,(y-y±.)] (y - y±.) " H^,±. (y " yi.) 
1 1 ^ 

^ ^ {1 - e-^""™^ {1 + 2 [e(m, g,) - 1] [^(m, G) + 1] Lmh}] — ^ Xj), (H'^,,)^^ 



I -nL{<Po+mh) 



2 

2 



nL 



1 ^ 

{1 + 2 [e(m, g,) - 1] [e(m, G) + 1] Lmh} — J] Xj^i, (H'^,_,) 



(B6) 



Using the expansion / (x±) ~ / (±Xo + h £(m, G)) ~ / (±Xo) +/i ^im, G)Vf (±Xo) = / (±xo) + 

L times 

5/(±xo) h where xj = m I ^i, 5-2, 52, • ■■,92] and ^"^(m,G) = G-^i{m,G) I 51, '52, 52 ■■,92 
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the diagonal entries of the transformed Hessian are: 



{^'f,±h)u = 5Z {'iif,±h 

i,j=0 
i,j=0 

L / g2f 



dxidxj 



x± 



i,i=0 



+ /iC^(m,G')V— ^ 

±xo OXiOXj 



(B7) 



H' 



/ 1 ±xo 



h 



/l±xo^ defined by the second term in ((BTj). Using the entries of the diagonalised 



with (^5H'j 

Hessian, the last term in the integrals (IBGO becomes 

L 

^ \ ~i (w \ ''■ i\ I 

/l±xo 



00 n (^ao7o 



H' 



/32) V ^lixoAi 



1 1 



n 7o 



/l±Xo 



^=0 " V " ' u/ 

disregarding terms of C + j;) . The expectation value of an arbitrary function / can then be 
approximated by 



£=0 



where we have disregarded terms of O (J + |),C» [l.Q-'^^Lmh^^ q (^j^i^^-2nLmhy gy simple 
inspection, equation (1B8I) is equivalent to the RS mean value equation (1A7I1 . 
The single variable mean value is then: 

{bkKk ^ 0' al^k) = J2p' (bfel {yu^^.}) bi' = (tanh {xo + xe + hi,) \hl, 0, g{^„ g^^,) . 
The expansion for /(x) = tanh (xq + + /i^^) is 



-1 times L—£ times 



/(x)=.m;.,+ l-KO' 1,0,0,... ,0,1,0,0,. ..,0 ^K„Gy/i 
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which results in the following expression for the single variable mean value 



1 - 2e 



-m 



Ilk 



A:=0 



where (Mi 



0£j 



^oi^Qj + '^oj'^£i + (^^jf^oj + ^li^ij is a matrix such that H 



xo 



1 — (^^fc) Mq^. In the basis of the H.^^ eigenvalues, the expressions for the diagonal 
elements of this matrix are 



(M'c 



Ot)kk 



(M'o,)oo 



i,j=0 

{{lJ±k)ok + i^±H)ekf 
2 Po 



(M'c 



1 - 
1 

L 



L ao 



(B9) 



1 



V£ = 2 L, 



A; (A; - 1) 

where 0{n) = 1 if n > and otherwise. The sum of the eigenvalues' inverse times the diagonal 
elements equation (1B9I) results in 



fc=0 



--E 

k=2 



6k/-j^ + 0{k-i-l) 



k{k-l) 



1 1 
nL ao 



1 



ao7o - Po 



1 1 

^ 7o 



E 

.A:=2 



1 



1) 



+ 



1 1 

nL Oq 



1 + 



"07 - Pi 



11 11 

n 7o nL ao 



1 + 



(qq + Pof _ Oo 

ao7o - Pi 7o 



— ~92tik^{''^likj92fik)~^^^ [^^fik^ i'T^^fik^^^fik) oifik^ {'^^fikJ difik)] 

where we have used that Efe=2 - 1)]"^ = {L - 1)/L, 7^^ = ^a^fc^ ("^m^' ^s^fc) ^^d 
^ ^ _^ (tto_+A)f _ oo 
ao ao7o - /^o 7o 



G^ki - gl^i^i {m%9l^k) ■ The final expression for the 
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expectation value of a single variable is 



dlfik^ {^^fiki dlfik) 



n 



1- {KkY 



m 



^^tik^ {^^tiki ^^tik) 92fik^ {^^iiki 92tik) 



nL 



m 



^k 



(BIO) 



To calculate (^^kh^k \h*^^. 0, (7*^,, Z\(yf*^), an off-diagonal element (a 7^ a') in the same block i, 
we can apply the equation (IB8I) with /(x) = tanh^ (xq + xe + hj^f.), thus the Hessian matrix is 



Htanh2{a;o+Xf) Ixq ~ ^ 



Mo^, thus: 



G'^fc^ ("^^fc, G^fc) - ^2Mfc^ i^U^ 9i^k) 



nL 



''T^^fik^^fj.k- 



(Ell) 



Finally, to calculate the expectation value for the product of two variables belonging to different 
blocks i i' (the sub-block index a is insignificant in this case), (^k'bk^\h^^k ~^ 9ifik^ di/ik)- We 
set /(x) = tanh (xq + xi + h^^j^) tanh (xq + + /i^^), thus the Hessian matrix 



H 



tanh(xo+a;^) tanh(zo+X£/) 



+Mi (m^^f^) {5ieSje> + 5ii>5je) - 2M2 (mj^^.) {5ie5je + 5ii>5ji^) , 



where Mo (m^^) = 1 - {m^^kf 1 - 3 (m^fc)^ , Mi {m^^i,) = \ l - {m^^k)\ and M2 {m^f,k) 
{''^^fikY 1 ~ ("^Aifc)^ ■ '^^^ diagonal elements JCee-k = ( H[ 



tanh(2:o+x^) tanh(xo+a;£/) 



in the basis 
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of eigenvectors of H,^ are 



2-Mo 



"0 



"0 

JU - 1) 



—25 if 



-^1 Hk) ^2 Kfc) _ ^ 



1 



JU - 1) 



thus, the sum of the diagonal elements is: 



9 «r Z^^fc,-±/^'^ 



fc=0 



1 -^oHfc) 



1 + 



(ao7o - A 



■--<1-M2K,) 



-+E 



("^Jtfc) 5'2/ifc) 



j(j - 1) 
1) 



[-^1 i^U) --^0 Wfc)] 



, ^^ij,k^ {'^^fiky^^fik) 92fj,k^ i^^fiky 92^J.k) / t \ 

+ Mo [m^^) . 

Using the sum of diagonal terms one then derives the expected correlation for variables belonging 
to two different blocks 



^k 



( t \2 „ ("^^jfc) fl'2^tfc) 



^^k) - [Kk) [1 - Hfe) 

G^ki {^U^ G'^k) - ai^^ki 9i^,k) 



riL 



1 - 3 (ml,) J [1 - (ml,) 

(B12) 



Keeping in mind that {b^^b^kl^^fik^ 9ifik^ dif^k) — ^ using equations (IB10P - (IB12l) . the covariance 
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matrix entries can be then calculated 



ea.e'a.' 



1 - Hk) 



W raa' 



+ 1-5 



+ 5'' (1-5 



'\ 92nk^ {^^fik-i 5'2/ifc) 



'\ ("^Aifc' ^Mfc) " 92fik^ i^^fik^ 92fik) 

nL 



where we have kept only the dominant terms at each entry, disregarding terms of order 



-2nm ,,h 



n L 



If the e^k and 6^ are unbiased variables, the variable Zi^^ = ^^^fc^^i&f, by virtue of the central 
limit theorem, obeys a normal distribution, with mean value and covariance matrix that can be 
obtained by employing the expressions derived for ^ 



{ht^,} l^k l^k l^k 



(B13) 



l=/=k 



£a£'a' 



n 



1-6 



nL 



where X*^ is given by equations (IA9I) and 



l^k 



1 2 



l^k 

are macroscopic variables of 0{1). In particular, R'^^i^ and V^^f. are free variables that can be used 
to optimise a given performance measure. 
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Appendix C: THE MESSAGES 

From the conditional probabilities of equations (l3|) and dH) and with the application of the 
probability distributions P(A^fc|B) of equation ([8]) in ([5]) we can express the message from 
nodes to nodes 6^ at time t + 1 as: 

n 

-S^ - (CD 

{B} a=l l^k 
^ {bfc} 

I dA^kP{A^u\B)Y,P{y^^^.k\l) [l + ^MfcbJVA,, lnP(|/^|A^fc;7)] 

{bfc} 

If P{yfj\A^k]l) = 111=1 P {Vfil^'l.k'^l), and ignoring C(£^fc) terms, the traces on can be 
written as 

EP(l/^|A^fc;7) [l + £^fcbIVA,,lnP(i/^|A^fc;7)] = 2"P A^^; 7) 

J2btPiy,\^,k-n) [l + £^,bIVA,,lnP(y^|A^,;7)] = 2%,P A^,; 7) In P (y^l^,; 7) 

{b.} ^^/^^ 
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thus, following from (IClll and neglecting 0{l/n) terms 

I dA,,P (A,,|B) n P {y,\Al,- 7) ^ InP [yAA%- 7 

a=l ^'■^ 



(RS)-t+l ^ c- 



a=l 



d/^ exp I --^^^ + In P (yJZ\; 7) 



n n-1 



X / dZ\exp 



and 



I 2X* J 9Z 



/ dA,,P(A,,|B)nnP(y,|4-7) ^lnP(|/,|4r;7 

>^ £=1 a=l ^^A'fc ^ 

n 

dA,,P(A,,|B)n^ {y,\Al,n) 



a=l 



n 



y*-P* - L-1 (\/* - P*) 



xn 



J dZ\ exp I -^^^^^ + In P(yjZ\; 7) 



dA exp 



2X* 



lnP(y^|A7) 



n-l 



X / dZ\exp 



I 2X* j dA 
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where '•^^''^fc and ^^^^^^^^ are suitable normalisation constants and t?°f = + + u^^. One 
can then define: 



1-1 



2X* 



dZ\ exp 



2X* dA 



9 



2X* 



+ 



yt-Ri \/t _ L-i (r* - i?*) 
Thus the expression for the RS message is: 



(RS)-t+l _ 



dT? exp{-n(^s)7^(Al/M)}P(l/,,^) 



d^9 exp{-n(^^)7^ (^9,1/^)} 



(C3) 



(C4) 
(C5) 

(C6) 



d 

In the large n limit, only the solutions ^^^/^ of — "^^^^T-^ = 0, that correspond to the minimum 
of H contribute to the integral. The dominant term in the integral is obtained via saddle point 
methods, which leads to the final expression for the message 



'""'Hi' 



'fik ' 



(C7) 



R' ' 

where ^9^^ is given by equation ( IDlll . 

The IRSB case is a little more delicate. The exponential is a sum over L 
functions^^^s^)?^ Therefore, a Taylor expansion close to the saddle point of equa- 
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/ ho 


hi ] 


H 


Ui 


h2 j 





\ dV 






1 




; 



tion (ID2I) is employed resulting in 

e=i 1=1 e=i 

where Eq = (^^sb)-^ ^t^^l, i^^^, is the energy of the ground state, At?* = t?* — t?^*^ i = 0,i and 
the entries /iq, /^i and /i2 satisfy the equation 

(y* - RT^ 
(\/*)"' 

where is the solution of equation (lD2l) . If ©"■" = (^9°, . . . , ■(9'^) and 

(Ht^).^. = 6jk[6joho + {I - 6jo) L''^h2] + (5jo + 4o) (1 - ^jfc) -^"^/^i is the Hessian of 
Et/^^^^)^K,<1/M),then 

^ /- 

£=1 

The matrix has the same structure as H^, therefore, the eigenvalues and eigenvectors of 
H-H can be obtained adapting equations (IB3I) and (IB4I1 by the substitutions = ^Oi ~/?o = hi 
and 7o = /i2- Expanding V{d,y^ at the saddle point 'i?^^* one obtains (^9°^*, y^) — 'Po + 
Pi (A^^o + A?9^') + (A^^o + A?9^') where = 



. The resulting messages are 



(lRSB)-t+l _ ^ 



j d@ exp |-^A0"^H^A0| + Pi (a^9° + A^9^') + (a# + A?9^' 



y df2 exp |-^A0^H^Ae| 



where the term proportional to Pi vanishes for parity reasons. In the basis of eigenvectors of 
H'^^, i.e. r = U"""© = (70, 7i, • • • ,7l)^ where U is adapted from equation (1B5I) . the message has 
the form: 



(IRSB) -t+l ^ 

'"'Ilk — ^Atfc 



j dr exp |-^|Ja< (A7')'| (Po + lP2ArTMi,,Ar) 
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where are the eigenvalues of Hn and Mq^, is adapted from equation (IBOO . 
The expression for the message is reduced to 



(1RSB)^J+1 ^ 



Kk-%k e^k V2V' 
- 2V' -R* 2n 1- ViV' ' 

The expression for the messages from b-nodes to y-nodes is: 



m 



{bfe} 



E^^'H E Piyu\B)llP'-'{h\{y^^^}) 

" EH E P{y.\B)l[P'-Hhi\{y.^u}) ' 

{bfe} i^T^M |b,^fc} '^^'^ 

which can be approximated by 

Id^ukP{yu\^ukn)P{Auk\B) [l + e.fcbJVA., lnP(y,|A,fc;7)] 

{bfc} 



{bfc} 



d 

^ + ^'^kb'l:-^^\nP{y^\A^k-n) 



J2 /dA,feP(y.|A,,;7)P(A,fe|6f) 

1 + m* 



9 

^ + ^:^kbtw-r^\nP{y^\A^k;-r) 



1 + mlJ)l 



uk 



n 



1 - Kk 



uk 



En 



Art 



tanh j arctanh (^tfc) 



'^k j^^^ i^fc 



but since rhlf, ~ (9 (ejyfc) we have that 



m^,k - 



tanh mtfc 
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Appendix D: THE SADDLE POINT OF H 



For the RS case the equation to be solved is: 

^iik 

thus, the equation to be satisfied is: 

= ul, + R'v{d%,y,) . (Dl) 
d d 

For the IRSB case we have that 77^77^^^^^^^ = 77777^^^^^^^ = , resulting in the set of equations: 



= ^i-{V'-R')v[^'^Ly,) 
= Kk-V'v[^%y,) , 



which is equivalent to: 



^7 = uU+{2V'-R')v{^lly,) , (D2) 



where ??°f = + 7?^ + w^a:- Observed that equation ( 1D2I) is equivalent to equation (IDlll and that 
the ground state ^^^^^ is independent of the indices and £. 

Appendix E: THE OPTIMISATION CONDITION 

Our goal is to devise an algorithm that returns a better estimate of the message at each iteration; 
we therefore apply a variational approach that optimises the free parameters of the model at each 
iteration. We expect to find a suitable set of parameters that maximises the drop in error per 
bit rate. 
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The error function has the form 



^* (7) = ypl - MVViv* 



(El) 



where is a positive constant. 
Observe that 



V27rF* 



dz exp 



^2 + (E* 

2F* 



In cosh(z) 



tanh(z) sinh 



tanh(z) sinh 



F* - F* 



sgn (F* — F*) Vz. Therefore sgn (F* — F*^ 



and that sgn 
sgn (M* -iV*). 

The second term of the right hand side of equation (lEll) is an imphcit function of the parameters 
7 through F* and F*, therefore 

^7* 



+ 



'N^J 9F* J d'ji 9F* J dy, 

where the partial derivatives with respect to F* and F* are 
5 / M* 



(E2) 



aF* 

d f 



Vz 



1 - tanh^ VF^z + E' 



N' - M* tanh VF^z + F 



3 / z 
^ I Vz 



2VF* 



1 - tanh^ 



^F^z + F* 



A^* - M* tanh VF*^ + F 



By the definition of the field ft/c^^fe we have that sgn (bkh^^k) — ^gn (^fc^^^fc) = sgn(6fcm^). Ex- 
ploiting Gaussian properties of the distribution of /i* ^ (JQ]) 



fc=i 

°° dw 



-00 V27rF* 



exp 



(M--F*f 

2F* 



;i - sgn(M)) 
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and we suppose that and F* are both exphcit functions of the parameters 7, therefore 



27rF* 



exp 



t\2 



2F* 



dE^ 1 



By differentiation equation ( lEll) and using equation (IE2I ) one obtains 



A2 



AT*) 



exp 



2F* 



9^* 1 dF^ 



n-f 



iV* - M* tanh ( \/F*z + E' 



Vz- 



du 



'dE^ 1 F* 



cosh 



exp 



F*z + F* 



fu- F* 



2F* 



2\/F* 



u - Af* tanh (u) 
2 cosh^ (u) 



97i 2 F* 97i 



27rF* 



exp 



(Ff 



2F* 



+ 



27rF* (iV* 



: exp 



(n-F*) 
2F* 



iV*-M*tanh {u) 



cosh (li) 



(E3) 



To optimise S'^ with respect to 7^ one requires = 0. The first term of the right hand side 

of equation (jESj) is independent of the index i and is zero if and only if the integrand is an odd 
function. This is true if tanh(M) = tanh ( j \lu G M. This condition is only satisfied 
if F* (7^^) = F* (7'^) which automatically makes M* = N^. By the application of this condition, 
the sum between curly 
positive, which implies 



the sum between curly brackets in the second term at the right hand side of Eq. (IE3l) is always 

9F* 1 F* aF* ^ 



d^i 2 F* d% 



The conditions F* (7"^) = F* (7^^) and 



7, 
9F* 



1 F* (9F* 



imply that: 



7 



a7i 2 F* 97, 

InF* = eo + e]"(7-7^) + i(7-7^)^ £2(7-7') + ... 
InF* = eo + 2e]"(7-y) + ^(7-y)"'F2(7-7') + ... , 
therefore, if the critical point is a minimum, then the expansion F*/ \/F* 
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exp ||eo + I (7 — 7^^)^ (E2 — IF2) (7 — 7'') + . . . | has a second term that satisfy the conditions: 
det (E2 — IF2) > and (E2 — ^^2)^. < 0, validating the optimisation process. 
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